Identifying and Improving Dataset References in Social Sciences Full Texts
Authors
Abstract
Scientific full-text papers are usually stored separately from their underlying research datasets. Authors typically refer to datasets by mentioning them, for example by their titles and years of publication. However, in most cases explicit links that would give readers direct access to the referenced datasets are missing. Manually detecting references to datasets in papers is time-consuming and requires an expert in the paper's domain. In order to make explicit all links to datasets in papers that have already been published, we suggest and evaluate a semi-automatic approach for finding references to datasets in social sciences papers. Our approach does not need a corpus of papers (no cold start problem) and it performs well on a small test corpus (gold standard). Our approach achieved an F-measure of 0.84 for identifying references in full texts and an F-measure of 0.83 for finding correct matches of detected references in the da|ra
Similar resources
A semi-automatic approach for detecting dataset references in social science texts
Today, the full texts of scientific articles are often stored in different locations from the datasets they use. Dataset registries aim at a closer integration by making datasets citable, but authors typically refer to datasets using inconsistent abbreviations and heterogeneous metadata (e.g. title, publication year). It is thus hard to reproduce research results, to access datasets for further analysis...
Identifying Opinion Holders for Question Answering in Opinion Texts
Question answering in opinion texts has so far mostly concentrated on the identification of opinions and on analyzing the sentiment expressed in opinions. In this paper, we address another important part of Question Answering (QA) in opinion texts: finding opinion holders. Holder identification is a central part of full opinion identification and can be used independently to answer several opin...
Fine-grained German Sentiment Analysis on Social Media
Expressing opinions and emotions on social media has become a frequent activity in daily life. People express their opinions about various targets via social media, and they are also interested in knowing other people's opinions on the same target. Automatically identifying the sentiment of these texts and also the strength of the opinions is an enormous help for people and organizations who are willing ...
Identifying and Explaining the Factors Affecting the Social Participation of the Iranian People in Natural Disasters
Over 2 million people worldwide suffer from life-threatening emergencies and natural disasters, annually. By participating in crisis management processes, different people and various sectors of the society can reduce the country's vulnerability to natural disasters. One of the most important issues in crisis management is the participation of people in all processes of the crisis management cy...
A Model for Detecting of Persian Rumors based on the Analysis of Contextual Features in the Content of Social Networks
A rumor is a collective attempt to interpret a vague but attractive situation by using the power of words. Therefore, identifying the language of rumors can help in detecting them. Previous research on the rumor detection problem has focused more on the contextual information in reply tweets and less on the content features of the original rumor. Most of the studies have been in...